library(dagitty)
library(ggdag)
library(ggplot2)

dag <- dagitty("
dag {
  EC -> EQ
  EC -> EU -> T
  EC -> GDP
  T -> EQ
  T -> GDP
  G -> GDP
}
")
ggdag(dag, text = TRUE) +
  theme_void()
December 3, 2025
Can a country reduce emissions while growing its GDP? This question has gained significant attention in recent years as countries around the world try to achieve economic development while maintaining environmental sustainability. In this assignment, we examine global power-sector emissions and GDP data from more than 200 countries to explore whether sustained economic growth can occur alongside meaningful emission reductions.
Economic growth and greenhouse gas emissions have traditionally been closely linked. An increase in GDP is often accompanied by higher emissions due to higher energy consumption, increased industrial activity, and infrastructure growth. However, recent trends show that some countries are beginning to break this pattern. By analyzing emissions alongside GDP, we can identify which countries are achieving ‘decoupling’, where the economy grows without a corresponding rise in emissions. This decoupling can result from a combination of factors such as clean energy, improvements in energy efficiency, technological innovation, and effective climate policies. Studying these trends provides valuable insight into how countries can pursue sustainable development, balancing economic progress with environmental responsibility. It can serve as a guide for other nations that aim to reduce their carbon footprint while maintaining economic growth.
The following DAG describes the key relationships between emissions and GDP. It shows how the Economy (EC), Energy use (EU), Technology (T) and Governance (G) interact to shape both Emissions quantity (EQ) and GDP. The economy directly influences energy use, emissions and GDP. Energy use contributes to technological change, which then affects both emissions and GDP, so the economy also influences technology indirectly through energy use. Governance has a direct impact on GDP. Overall, the DAG highlights the direct and indirect connections among these variables and helps identify which variables need to be controlled for in the analysis to reduce confounding and selection bias.
| Variable | Abbreviation |
|---|---|
| Economy | EC |
| Emissions Quantity | EQ |
| Energy use | EU |
| Technology | T |
| Governance | G |
| GDP Growth | GDP |
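As a rough illustration of how the DAG can guide the choice of control variables, the sketch below asks `dagitty` for the minimal adjustment set. Treating GDP as the exposure and EQ as the outcome is an assumption made only for this example; the graph itself is the one defined above, written out edge by edge.

```r
library(dagitty)

# Same graph as above, with the chain EC -> EU -> T split into two edges
dag <- dagitty("dag {
  EC -> EQ
  EC -> EU
  EC -> GDP
  EU -> T
  T -> EQ
  T -> GDP
  G -> GDP
}")

# Assumption for illustration: GDP is the exposure and EQ is the outcome
adj <- adjustmentSets(dag, exposure = "GDP", outcome = "EQ")
print(adj)
```

Under these assumptions the only minimal set is EC and T: EC blocks the paths that run through the economy and T blocks the path that runs through technology. Governance (G) does not need to be adjusted for because it affects GDP only.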
Here, we will work with the following data sets:
GDP data
The data set comes from the World Bank and uses the indicator NY.GDP.MKTP.CD, which reports national gross domestic product (GDP) in current U.S. dollars. It measures the total value of all goods and services produced in a country in a given year, converted into U.S. dollars using that year’s official exchange rate. As it is a ‘current’ GDP, the values reflect the prices and exchange rates of the respective year, not adjusted for inflation or purchasing-power differences.
Global emissions data
This data set is retrieved from Climate TRACE. It provides a detailed, open-access global inventory of greenhouse gas and air pollution emissions. It aggregates data from hundreds of millions of emission sources worldwide, including power plants, industrial facilities, transportation, agriculture, and more, allowing emissions to be broken down by country, sector, sub-sector and time period. Annual country-level data start in 2015, and monthly and source-level records are available from 2021 onward.
In this analysis, we focus specifically on the power sector data, which provides detailed emissions estimates, allowing us to examine trends and impacts within one of the largest sources of global greenhouse-gas emissions.
Question: Is it possible to achieve emission reductions while maintaining GDP growth?
H0: There is no significant relationship between GDP growth and emission reductions.
HA: There is a significant relationship between GDP growth and emission reductions.
In this project, we will examine whether it is possible for countries to achieve emission reductions while maintaining GDP growth. Evidence of decoupling requires that one variable (GDP) increases while the other (emissions) decreases. Additionally, the global emissions and GDP data exhibit clear signs of overdispersion, making the Negative Binomial Model an appropriate choice for analysis.
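One quick, informal way to see overdispersion is to compare the variance of a count variable with its mean: under a Poisson model the two are equal, so a ratio far above 1 points toward a model such as the Negative Binomial. The sketch below uses simulated counts rather than the project data, and the mean, dispersion, and seed are arbitrary assumptions.

```r
set.seed(42)  # assumed seed, only for reproducibility

# Simulated counts with mean 50: one Poisson, one Negative Binomial
# (size = 2 is an arbitrary dispersion value chosen for illustration)
pois_counts <- rpois(5000, lambda = 50)
nb_counts   <- rnbinom(5000, mu = 50, size = 2)

# Variance-to-mean ratio: close to 1 for Poisson, much larger for NB
pois_ratio <- var(pois_counts) / mean(pois_counts)
nb_ratio   <- var(nb_counts) / mean(nb_counts)

c(poisson = pois_ratio, negative_binomial = nb_ratio)
```

The same ratio could be computed on the observed outcome before choosing between a Poisson and a Negative Binomial model.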
Before fitting the model to the actual data, we will simulate data that conform to the Negative Binomial Model. This will help us understand the model and ensure that it behaves as expected before applying it to the real-world data set.
The statistical notation for the Negative Binomial Model is:
\[ \begin{align} \text{CountOutcome} &\sim \text{NegativeBinomial}(\mu, \theta) \\ \log(\mu) &= \beta_0 + \beta_1 \, \text{Predictor} \end{align} \]
Fit a negative binomial model like this:
model <- glm.nb(count_outcome ~ predictor, data = my_data)
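For reference, under the mean and dispersion parameterization used by `MASS::glm.nb` (mean \(\mu\), dispersion \(\theta\)), and writing \(Y\) for the outcome, the variance grows quadratically with the mean:

\[ \begin{align} \mathrm{E}[Y] &= \mu \\ \mathrm{Var}(Y) &= \mu + \frac{\mu^2}{\theta} \end{align} \]

As \(\theta \to \infty\) the variance approaches \(\mu\) and the model reduces to a Poisson model; smaller values of \(\theta\) correspond to stronger overdispersion.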
library(MASS) # provides glm.nb()
# Set the parameters
n <- 10000 # Data set size
beta0 <- 0.6
beta1 <- 1.8
r <- 2 # Dispersion
# Uniform distribution between 1 and 10
x <- runif(n, min = 1, max = 10)
# Calculate mean (mu) using the Negative Binomial link function (log link)
mu <- exp(beta0 + beta1 * x)
# Simulate the dependent variable (y)
y <- rnbinom(n = n, mu = mu, size = r)
# Create a data set
my_data <- data.frame(x, y)
# Fit a negative binomial model
negbinomial <- glm.nb(y ~ x, data = my_data)
library(ggplot2)
# Create predictions from the model
my_data$prediction <- predict(negbinomial, type = "response")
# Simple scatter plot with fitted line
ggplot(my_data, aes(x = x, y = y)) +
  geom_point(alpha = 0.8, size = 0.2, color = "hotpink") +
  geom_line(aes(y = prediction), color = "darkgreen", linewidth = 1) +
  labs(title = "Negative Binomial Model",
       x = "X axis",
       y = "Y axis") +
  theme_minimal()
Let’s check the coefficients of this model.
| | Negative Binomial |
|---|---|
| (Intercept) | 0.574 |
| | (0.017) |
| x | 1.803 |
| | (0.003) |
| Num.Obs. | 10000 |
| AIC | 228568.0 |
| BIC | 228589.6 |
| Log.Lik. | -114280.989 |
| RMSE | 14022924.96 |
Next, let’s check the dispersion parameter, which controls the amount of overdispersion in the data.
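The code for this check is not shown here. A minimal, self-contained sketch follows, assuming the model is fit with `MASS::glm.nb` as in the simulation. It re-simulates data with the same coefficients and dispersion (the seed and the smaller sample size are assumptions made for a quick, reproducible run) and reads the estimated dispersion from the `theta` component of the fitted object.

```r
library(MASS)

set.seed(123)  # assumed seed, only for reproducibility of this sketch

# Re-simulate with the same coefficients and dispersion (smaller n for speed)
n  <- 2000
x  <- runif(n, min = 1, max = 10)
mu <- exp(0.6 + 1.8 * x)
y  <- rnbinom(n = n, mu = mu, size = 2)  # true dispersion is 2

fit <- glm.nb(y ~ x, data = data.frame(x, y))

# Estimated dispersion (theta) and its standard error
theta_hat <- fit$theta
theta_se  <- fit$SE.theta
c(theta = theta_hat, se = theta_se)
```

An estimate of theta close to the simulated value of 2 suggests that the model recovers the dispersion used to generate the data.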
library(tidyverse) # readr, dplyr, tidyr, stringr, ggplot2
library(janitor)   # clean_names()
library(lubridate) # year()
library(here)      # here()

# Read the data for the power sector
# Note: zeros are read as missing values (NA) through the `na` argument
electricity <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'power', 'DATA', 'electricity-generation_country_emissions_v5_1_0.csv'),
                        na = c(" ", "0", "0.0", "NA")) %>%
  clean_names()
heat_plants <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'power', 'DATA', 'heat-plants_country_emissions_v5_1_0.csv'),
                        na = c(" ", "0", "0.0", "NA")) %>%
  clean_names()
other_energy <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'power', 'DATA', 'other-energy-use_country_emissions_v5_1_0.csv'),
                         na = c(" ", "0", "0.0", "NA")) %>%
  clean_names()
# Read the GDP data set
gdp <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'API_NY.GDP.MKTP.CD_DS2_en_csv_v2_280632', 'API_NY.GDP.MKTP.CD_DS2_en_csv_v2_280632.csv'),
                na = c(" ", "NA"),
                skip = 4) %>%
  clean_names()
Combine and clean power sub sector data for yearly analysis.
# Combine all three sub sector data sets
power <- bind_rows(electricity, heat_plants, other_energy)
# Make a new column 'year' to segregate according to year
power_year <- power %>%
  mutate(year = year(start_time))
# Rename the column and filter out the year 2025
power_year_cleaned <- power_year %>%
  # Rename the column to match the GDP data set
  rename(country_code = iso3_country) %>%
  # Filter out the year 2025 (no GDP data available for 2025)
  filter(year != 2025)
Clean and reshape GDP data for further analysis.
# Convert data to long format
gdp_long <- gdp %>%
  # Drop 'x70' column and columns from x1960 to x2014
  dplyr::select(-x70, -x1960:-x2014) %>%
  pivot_longer(cols = starts_with("x"),
               names_to = "year",
               values_to = "gdp_values") %>%
  # Remove 'x' from the 'year' string
  mutate(year = stringr::str_replace(year, pattern = "x", replacement = ""),
         # Convert the resulting string to a numeric data type
         year = as.numeric(year))
Join power sub sector and GDP data for a streamlined structure and easy analysis.
# Join the data frames and reorder the columns
power_gdp <- power_year_cleaned %>%
  left_join(gdp_long, by = c("country_code", "year")) %>%
  # Keep only rows with a non-missing, non-empty 'country_name'
  filter(!is.na(country_name), country_name != "") %>%
  # Reorder the columns by position
  dplyr::select(13, 1, 12, everything())
# Calculate the mean emissions of countries for each year (2015 - 2024)
power_gdp_mean <- power_gdp %>%
  drop_na(emissions_quantity) %>%
  group_by(country_name, year) %>%
  summarize(mean = mean(emissions_quantity),
            .groups = 'drop') %>%
  arrange(desc(mean))
# Top 10 countries by overall mean emissions for all years
top_10_countries <- power_gdp_mean %>%
  group_by(country_name) %>%
  summarize(overall_mean = mean(mean), .groups = 'drop') %>%
  arrange(desc(overall_mean)) %>%
  slice_head(n = 10)
# Filter to keep only top 10 countries
power_gdp_mean_top10 <- power_gdp_mean %>%
  filter(country_name %in% top_10_countries$country_name)
Visualize the top ten highest emitting countries.
ggplot(power_gdp_mean_top10, aes(x = year, y = mean, color = country_name)) +
  geom_line() +
  scale_x_continuous(breaks = seq(2015, 2024, by = 1)) +
  scale_y_continuous(labels = scales::label_number(scale = 1e-6, suffix = "M t")) +
  labs(x = "Year",
       y = "Mean CO2 emissions (metric tonnes)",
       title = "Top ten highest emitting countries from 2015 to 2024",
       color = "Country") +
  theme_classic()
# Aggregate GDP and emissions by country, subsector, and year
country_totals <- power_gdp %>%
  group_by(country_name, country_code, subsector, year) %>%
  summarize(total_emissions = sum(emissions_quantity, na.rm = TRUE),
            gdp_values = first(gdp_values),
            .groups = 'drop')
# Calculate annual changes in GDP and emissions
# Group by subsector as well, so lag() compares each subsector with its own previous year
annual_changes <- country_totals %>%
  arrange(country_code, subsector, year) %>%
  group_by(country_name, country_code, subsector) %>%
  mutate(emissions_pct_change = (total_emissions - lag(total_emissions)) / lag(total_emissions) * 100,
         gdp_pct_change = (gdp_values - lag(gdp_values)) / lag(gdp_values) * 100) %>%
  ungroup()
# Find the cases where emissions decreased while GDP increased
decoupling <- annual_changes %>%
  filter(emissions_pct_change < 0 & gdp_pct_change > 0) %>%
  dplyr::select(country_name, subsector, year, emissions_pct_change, gdp_pct_change)
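To make the year-over-year calculation and the decoupling filter concrete, here is a small sketch on a toy data frame. The country codes and all values are hypothetical and chosen only so the arithmetic is easy to follow; the column names mirror the ones used above.

```r
library(dplyr)

# Hypothetical toy values, not taken from the project data
toy <- tibble(
  country_code    = rep(c("AAA", "BBB"), each = 3),
  year            = rep(2015:2017, times = 2),
  total_emissions = c(100, 90, 95, 50, 55, 60),
  gdp_values      = c(1000, 1100, 1050, 500, 520, 540)
)

# Percentage change relative to the previous year, computed within each country
toy_changes <- toy %>%
  arrange(country_code, year) %>%
  group_by(country_code) %>%
  mutate(
    emissions_pct_change = (total_emissions - lag(total_emissions)) / lag(total_emissions) * 100,
    gdp_pct_change       = (gdp_values - lag(gdp_values)) / lag(gdp_values) * 100
  ) %>%
  ungroup()

# Decoupling: emissions fall while GDP rises in the same year
toy_decoupled <- toy_changes %>%
  filter(emissions_pct_change < 0, gdp_pct_change > 0)

toy_decoupled
```

Only AAA in 2016 qualifies: its emissions fall by 10% while its GDP rises by 10%. The first year of each group has no previous value, so its change is `NA` and `filter()` drops it.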
# Summarize, for each country, how often emissions decreased while GDP increased
decoupling_case <- decoupling %>%
  group_by(country_name) %>%
  summarize(decoupling_years = n(),
            avg_emission_reduction = mean(emissions_pct_change),
            avg_gdp_growth = mean(gdp_pct_change),
            .groups = 'drop') %>%
  arrange(desc(decoupling_years))
# Filter the top ten countries with decoupling effects
decoup_ten <- decoupling_case %>%
  arrange(desc(decoupling_years)) %>%
  slice_head(n = 10)
# Plot a graph for top ten countries
ggplot(decoup_ten, aes(x = reorder(country_name, decoupling_years),
                       y = decoupling_years)) +
  geom_col(fill = "darkolivegreen4") +
  geom_text(aes(label = decoupling_years), hjust = -0.3, size = 3.5) +
  coord_flip() +
  theme_classic() +
  labs(x = "Top ten countries",
       y = "Number of years with decoupling",
       title = "Top 10 countries: Years of emission reduction while increasing GDP") +
  ylim(0, max(decoup_ten$decoupling_years) * 1.1)
# Predictions for GDP growth (holding emission reduction constant)
# Note: `nb_model` must be a fitted model object; its definition is not shown here
pred_gdp <- decoupling_case %>%
  mutate(avg_emission_reduction = mean(decoupling_case$avg_emission_reduction),
         variable = "GDP growth") %>%
  # type = "response" returns predictions on the count scale rather than the log scale
  mutate(pred = predict(nb_model, newdata = ., type = "response"))
# Predictions for emission reduction (holding GDP growth constant)
pred_emission <- decoupling_case %>%
  mutate(avg_gdp_growth = mean(decoupling_case$avg_gdp_growth),
         variable = "Emission Reduction") %>%
  mutate(pred = predict(nb_model, newdata = ., type = "response"))
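The `predict()` calls above rely on a fitted object named `nb_model`, whose definition does not appear in this post. One plausible way to obtain it, assuming the model regresses `decoupling_years` on the two averaged predictors with `MASS::glm.nb`, is sketched below on simulated stand-in data. The data frame `toy_case`, its values, the seed, and the model formula are all assumptions made for illustration.

```r
library(MASS)

set.seed(1)  # assumed seed for a reproducible illustration

# Hypothetical stand-in for decoupling_case (values are simulated, not real)
toy_case <- data.frame(
  avg_gdp_growth         = runif(300, min = 1, max = 15),
  avg_emission_reduction = runif(300, min = -30, max = -1)
)
toy_case$decoupling_years <- rnbinom(
  300,
  mu   = exp(1 + 0.08 * toy_case$avg_gdp_growth),
  size = 1
)

# Assumed model formula; the original specification is not shown in the post
nb_model <- glm.nb(decoupling_years ~ avg_gdp_growth + avg_emission_reduction,
                   data = toy_case)

# The default prediction is on the log (link) scale;
# type = "response" returns expected counts instead
pred_link     <- predict(nb_model, newdata = toy_case)
pred_response <- predict(nb_model, newdata = toy_case, type = "response")
```

Because the model uses a log link, exponentiating the link-scale predictions gives the response-scale predictions, which are the ones comparable to the observed `decoupling_years` counts in a plot.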
# Combine predictions for GDP and emissions
pred_combined <- bind_rows(pred_gdp, pred_emission)
# Plot the GDP and emissions
ggplot() +
  geom_point(data = decoupling_case,
             aes(x = avg_gdp_growth, y = decoupling_years, shape = "GDP growth"),
             size = 0.5, alpha = 0.7, color = "hotpink") +
  geom_point(data = decoupling_case,
             aes(x = avg_emission_reduction, y = decoupling_years, shape = "Emissions reduction"),
             size = 0.5, alpha = 0.7, color = "darkgreen") +
  geom_line(data = pred_gdp,
            aes(x = avg_gdp_growth, y = pred, color = "GDP growth"),
            linewidth = 0.8) +
  geom_line(data = pred_emission,
            aes(x = avg_emission_reduction, y = pred, color = "Emissions reduction"),
            linewidth = 0.8) +
  scale_color_manual(values = c("GDP growth" = "hotpink",
                                "Emissions reduction" = "darkgreen"),
                     name = "Predictor") +
  scale_shape_manual(values = c("GDP growth" = 12, "Emissions reduction" = 12),
                     name = "Predictor") +
  theme_classic() +
  theme(legend.position = "right",
        plot.title = element_text(face = "bold", size = 14)) +
  labs(x = "Predictor value (%)",
       y = "Decoupling years",
       title = "A decade of decoupling: Rising GDP and Declining Emissions",
       subtitle = "Lines show model predictions; points show actual observations")
@online{poudel2025,
author = {Poudel, Aakriti},
title = {Global {Emissions} and {GDP}},
date = {2025-12-03},
url = {https://aakriti-poudel-chhetri.github.io/posts/2025-12-global-emissions-gdp/},
langid = {en}
}